MAVL and StickWRLD: visually exploring relationships in nucleic acid sequence alignments

Author

  • William C. Ray
Abstract

Many powerful tools have been created to detect and describe the similarities between nucleic acid or protein sequences. Frequently these take the form of a sequence consensus, expressing simple most popular positional identities, positional identities with allowances for varying positions or some type of statistical description of the positional frequency characteristics of the defining sequence family. Despite the fact that some provide intuitively interpretable descriptions of the consensuses themselves, they typically do not give the viewer any information about regions of the sequence that might have inter-positional dependencies, and that therefore do not obey a strict consensus behavior. Herein, we present MAVL (Multiple Alignment Variation Linker) and StickWRLD. MAVL is our web-based application for detecting and displaying both positive and negative inter-positional correlations in nucleic acid sequences. MAVL examines all positional pairs in each of a collection of pre-aligned sequences and determines any pairs that occur with either greater or lesser frequency than a positional frequency matrix would predict. These data are then composited into a StickWRLD representation and supplied back to the user as a VRML (virtual reality modeling language) file. MAVL and StickWRLD can be accessed at http://www.microbial-pathogenesis.org/stickwrld/. A tutorial that explains MAVL features and demonstrates typical user interactions with StickWRLD graphs is available at http://www.microbial-pathogenesis.org/stickwrld/tutorial/sticktut2.html. This tutorial is quite large; please be patient while it loads.
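The pairwise comparison described in the abstract can be illustrated with a minimal sketch. It assumes a pre-aligned set of equal-length sequences, and it scores each pair of positions by the difference between the observed joint frequency of two residues and the product of their independent positional frequencies (what a positional frequency matrix alone would predict). The function names, the residual score, and the `threshold` parameter are illustrative assumptions; the abstract does not specify the statistic MAVL actually uses.

```python
from collections import Counter
from itertools import combinations


def positional_frequencies(alignment):
    """Return, for each column, a dict mapping residue -> relative frequency."""
    n = len(alignment)
    length = len(alignment[0])
    return [
        {res: count / n
         for res, count in Counter(seq[i] for seq in alignment).items()}
        for i in range(length)
    ]


def pair_residuals(alignment, threshold=0.1):
    """List (pos_i, res_i, pos_j, res_j, residual) for residue pairs whose
    observed joint frequency differs from the independence expectation by at
    least `threshold`. Positive residuals mean over-representation, negative
    residuals mean under-representation. (Hypothetical scoring, not MAVL's.)"""
    if not alignment or any(len(s) != len(alignment[0]) for s in alignment):
        raise ValueError("alignment must be non-empty with equal-length sequences")
    n = len(alignment)
    freqs = positional_frequencies(alignment)
    results = []
    for i, j in combinations(range(len(alignment[0])), 2):
        observed = Counter((seq[i], seq[j]) for seq in alignment)
        for a, fa in freqs[i].items():
            for b, fb in freqs[j].items():
                expected = fa * fb                 # prediction from the frequency matrix
                actual = observed.get((a, b), 0) / n
                residual = actual - expected
                if abs(residual) >= threshold:
                    results.append((i, a, j, b, residual))
    return results


# Toy alignment: columns 0 and 3 co-vary (A with T, G with C).
alignment = ["ACGT", "ACGT", "GCGC", "GCGC"]
hits = pair_residuals(alignment)
for hit in hits:
    print(hit)
```

In this toy alignment a consensus or frequency matrix would report columns 0 and 3 as evenly split between two residues each, while the pairwise residuals show that A at position 0 always accompanies T at position 3 (positive residual) and never C (negative residual).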


Similar resources

MAVL/StickWRLD for protein: visualizing protein sequence families to detect non-consensus features

A fundamental problem with applying Consensus, Weight-Matrix or hidden Markov models as search tools for biosequences is that there is no way to know, from the model, if the modeled sequences display any dependencies between positional identities. In some instances, these dependencies are crucial in correctly accepting or rejecting other sequences as members of the family. MAVL (multiple alignm...

MAVL/StickWRLD: analyzing structural constraints using interpositional dependencies in biomolecular sequence alignments

The increasing availability of structurally aligned protein families has made it possible to use statistical methods to discover regions of interpositional dependencies of residue identity. Such dependencies amongst residues often have structural or functional implications, and their discovery can supply valuable constraints that assist in the refinement of measured, or predicted molecular stru...

Optimization of Synthetic Proteins: Identification of Interpositional Dependencies Indicating Structurally and/or Functionally Linked Residues

Protein alignments are commonly used to evaluate the similarity of protein residues, and the derived consensus sequence is used for identifying functional units (e.g., domains). Traditional consensus-building models fail to account for interpositional dependencies: functionally required covariation of residues that tend to appear simultaneously throughout evolution and across the phylogenetic tree...

Phylogenetic and sequence analysis of the growth hormone gene of two sturgeons, Huso huso and Acipenser gueldenstaedtii

In this study, the growth hormone cDNA (cGH) of the Beluga sturgeon (Huso huso) and the Russian sturgeon (Acipenser gueldenstaedtii) was cloned and sequenced, and phylogenetic relationships were examined using nucleic acid and amino acid sequences. The nucleotide sequence of the Beluga GH has an open reading frame of 645 nucleotides encoding a protein of 214 amino acid residues. The signal peptide cleav...

PairsDB atlas of protein sequence space

Sequence similarity/database searching is a cornerstone of molecular biology. PairsDB is a database intended to make exploring protein sequences and their similarity relationships quick and easy. Behind PairsDB is a comprehensive collection of protein sequences and BLAST and PSI-BLAST alignments between them. Instead of running BLAST or PSI-BLAST individually on each request, results are retrie...

Journal:
  • Nucleic Acids Research

Volume 32, Web Server issue

Publication date: 2004